Potential Impact of PI3K-AKT Signaling Pathway Genes, KLF-14, MDM4, miRNAs 27a, miRNA-196a Genetic Alterations in the Predisposition and Progression of Breast Cancer Patients

Simple Summary: The genomic landscape of breast cancer (BC) is complex. Previous studies have not extensively examined the correlation of genotype and allele variations in the PI3K, AKT-1, KLF-14, MDM4, miR-27a, and miR-196a genes with predisposition to breast cancer in Saudi Arabia. To address this gap, we conducted a case-control study of 230 subjects (115 cases and 115 controls). Genotyping was performed by ARMS-PCR, and the results were confirmed by Sanger sequencing. Novel and known gene variants were studied by whole-exome sequencing (WES) on the Illumina NovaSeq 6000 platform. A strong association was found between variants of the PI3K-AKT signaling pathway genes, as well as the KLF 14-AA, MDM4-GA, miR27a-GG, and miR-196a-CT genotypes, and breast cancer susceptibility and progression. The results could help to identify and classify those at risk of breast cancer in the future. WES can provide insight into disease mechanisms for the development of more effective therapies.

Abstract: Genome-wide association studies have reported links between SNPs and the risk of breast cancer. This study investigated the association of selected gene variants by predicting them as possible target genes. Advances in molecular techniques, including the availability of whole-exome sequencing (WES), now offer opportunities for the simultaneous investigation of many genes. Genotyping of PI3K, AKT-1, KLF-14, MDM4, miR-27a, and miR-196a was performed by ARMS-PCR and Sanger sequencing. Novel and known gene variants were studied by whole-exome sequencing on the Illumina NovaSeq 6000 platform. This case-control study reports significant differences between BC patients and healthy controls in the polymorphic variants PI3K C > T, AKT-1 G > A, KLF 14 C > T, MDM4 A > G, miR-27a A > G, and miR-196a-2 C > T (p < 0.05).
MDM4 A > G genotypes were strongly associated with BC predisposition (OR 2.08 and 2.15, p < 0.05, in the codominant and dominant models, respectively). The MDM4 A allele showed a similar effect (OR 1.76, p < 0.05), whereas it remained protective in the recessive model for BC risk. AKT1 G > A genotypes were strongly associated with BC susceptibility in all genetic models, whereas PI3K C > T genotypes were associated with breast cancer predisposition in the recessive model (OR 6.96). Polymorphic variants of KLF-14 A > G, MDM4 G > A, miR-27a A > G, and miR-196a C > T were strongly associated with stage and tamoxifen treatment. Risk variants were also identified by whole-exome sequencing in our BC patients. We conclude that there is a strong association between PI3K-AKT signaling pathway gene variants and breast cancer susceptibility and progression. Similarly, the KLF 14-AA, MDM4-GA, miR27a-GG, and miR-196a-CT genotypes were associated with a higher risk of BC and were strongly correlated with the staging of BC patients. This study also reported low-, novel-, and intermediate-genetic-risk variants of PI3K, AKT-1, MDM4, and KLF-14 identified by whole-exome sequencing. These variants should be further investigated in larger cohort studies.

The PIK3R1 gene encodes the PIK3R1 protein.
PI3K is an important protein in the AKT signaling pathway, which plays an important role in cell survival, differentiation, growth, and glucose trafficking and utilization. The Glu545Lys (rs104886003) and His1047Tyr (rs121913281) mutations induce a conformational change in the phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA) protein (Figure 1).

endometrial cancer, and stomach cancer [16]. In breast malignancy, several SNPs and mutations in MDM4 have been reported to promote breast tumor growth, including the rs11801299 G > A polymorphism. The MDM4 rs11801299 G > A polymorphism has been reported to be associated with retinoblastoma susceptibility [17], the risk of gastric cancer in a Chinese population [18], and the risk of breast cancer [19]. In 2018, an Iranian study reported an association between the MDM4 rs11801299 G > A polymorphism and susceptibility to breast tumors [20].
The Krüppel-like transcription factors (KLF) are a family of 17 transcription factors that modulate several genes essential for different cellular processes [21]. Recently, the importance of KLF14, also called BTEB5, was postulated in different genetic studies revealing its significant role as a primary regulator of gene expression [22]. KLF14 has been shown to be a novel tumour suppressor that is often downregulated in human cancers, which points to its role as an important biomarker for disease progression and for developing new cancer treatments [23,24]. Meta-analyses and genome-wide studies examining the effect of polymorphic variants found that the rs972283 polymorphism in KLF14 confers a high risk of disease: the G allele is associated with type 2 diabetes (T2D) and metabolic disorders in different populations, and in another study the A allele was associated with polycystic ovary syndrome in specific populations [25][26][27]. The association of the KLF-14 rs972283 variant with T2D and breast cancer susceptibility was found to be weak among European and African populations [28]. However, this variant has been strongly linked to patients with a genetic predisposition for T2D and with breast or prostate cancer [29].

In breast cancer, the phosphatidylinositol-3-kinase (PI3K), protein kinase B (PKB/AKT), and mammalian target of rapamycin (mTOR) pathway is frequently oncogenically activated, which promotes tumor development, disease progression, and therapy resistance. Recent studies suggest that the intricate interactions between the PI3K-AKT-mTOR pathway and several interacting cell signaling cascades can enhance the advancement of breast cancer. The single-nucleotide variations rs104886003 G > A (Glu545Lys) and rs121913281 C > T (His1047Tyr) are shown in green surface representation on the catalytic subunit in Figure 1. This figure was prepared using YASARA and modified from Elfaki et al. [30].
MicroRNAs are small, single-stranded, non-coding RNAs. Most of them are downregulated in cancer cells, which leads to cellular transformation followed by tumor development and cancer progression. miR-27a-3p is recognized as an oncogenic RNA in multiple malignancies, including colorectal, breast, and gastric cancer. MiR-27a is mostly located in the exosomes of BC cells [30][31][32]. Exosomal miR-27a-3p has been reported to foster immune escape by activating PD-L1 via the MAGI2/PTEN/PI3K axis in breast cancer [33]. The miR-27a rs895819 A/G variant is a common SNP located in the loop of the pre-miRNA and may have a key role in miR-27a maturation [34,35]. The A-to-G change at this SNP affects the minimum free energy (MFE) of the structure and, as a result, the function of the miRNA might be affected to some extent. Several studies suggested that the variant allele decreases the risk of cancer, whereas other studies showed the opposite [36]. In one meta-analysis, the miR-27a rs895819 polymorphism was reported to be linked with breast cancer susceptibility among Caucasians. In addition, the AA genotype of miR-27a was strongly linked with breast cancer susceptibility, whereas the heterozygous AG genotype and the G allele were protective factors [37].
Recent studies have shown that miR-196b acts as a tumor suppressor in various cancer types [38]; in HepG2 cells, it suppressed cell proliferation and induced apoptosis [39]. One of the most commonly reported SNPs is miR-196a rs11614913 C > T, which may increase or decrease the extent of translation of the target protein and may therefore alter its expression and function, which in turn can increase cancer susceptibility [40]. MiR-196a has been reported to target several genes that may be involved in the cell cycle, apoptosis, and differentiation. MiR-196a targets the annexin-A1 (ANXA1) gene, which controls physiological mechanisms such as exocytosis, hormone secretion, apoptosis, and signal transduction [41]. Some studies have examined the link between the miR-196a (rs11614913) polymorphism and breast cancer risk [42]. Some report it as a protective factor [43], whereas others, such as Omrani et al. and Qi et al., indicated a strong association with breast cancer susceptibility [44,45].
The human genome's protein-coding regions, which contain around 85% of disease-causing variations, can almost entirely be covered by whole-exome sequencing (WES) [46]. The exome makes up around 1% of the entire human genome, making WES a potent tool for medical genetic research. Future case-control and family-based NGS research will be more effective thanks to WES, which is frequently more affordable and permits the sequencing of more individuals [47]. More recently, 65 loci strongly linked with breast cancer were found by a genome-wide association study [48].

Materials and Methods
This study was conducted on primary BC patients (n = 115) and gender-matched healthy women from the general population (n = 115) who had no history of any type of cancer and were not related to the patients. Formalin-fixed paraffin-embedded (FFPE) tissue specimens were obtained from the Division of Histopathology, King Salman Armed Forces Hospital, North-western Region, Tabuk city, and other hospitals. Patient samples and records were handled in accordance with the Declaration of Helsinki (revised in 2013) under the ethical approval of the Armed Forces Hospital Research Ethics Committee (KSAFH-REC-2020-345/8 September 2020) and the ethics committee of the University of Tabuk (protocol code UT-115-13-2020/24 April 2020). Written informed consent was obtained from all patients. The experiments were carried out at the Genome and Biotechnology Unit, Faculty of Science, University of Tabuk.
Inclusion criteria of patients: The study included clinically confirmed cases of breast cancer patients who were Saudi women. Specimens were collected from primary breast cancer patients who had been diagnosed based on the clinical, histopathological, and radiological findings. The study also included breast cancer cases who had received chemotherapy, hormonal therapy, and radiotherapy.
Exclusion criteria: The exclusion criteria included (1) breast cancer patients diagnosed with multiple cancer types, (2) patients who were unable to comply with the study protocol, (3) non-Saudi women with breast cancer, and (4) any breast cancer patient with a history of previous significant malignancy.
Inclusion criteria for healthy controls: A healthy control cohort was assembled from participants visiting King Fahd Special Hospital, Tabuk, Saudi Arabia, for a routine checkup. These participants completed the informed consent form and filled in a questionnaire. The inclusion criteria included (1) gender-matched healthy women, (2) women aged 40 years or older (≥40), and (3) only ethnic Saudi women.
Exclusion criteria for healthy controls: The exclusion criteria included healthy women with a family history of breast cancer, non-ethnic Saudi women, and women below 40 years of age.

Demographic Data
Each breast cancer patient filled out a standardized questionnaire about their demographics, family history, and previous knowledge. To determine relevant clinical history, detailed laboratory and clinical data were collected.

DNA Extraction from Cases and Controls
The FFPE samples were collected from the pathology department, and some peripheral blood specimens were also obtained by venipuncture and placed in EDTA tubes after the clinicopathological findings had been assessed. DNA was extracted from the FFPE specimens using the QIAamp DNA FFPE Tissue Kit (Cat. No. 56404) from Qiagen (Hilden, Germany) according to the manufacturer's instructions. Similarly, blood specimens were processed for genomic DNA extraction using the DNeasy Blood Kit (Cat. No. 69506) from Qiagen (Germany). The extracted DNA was dissolved in nuclease-free water and kept at 4 °C until needed.
The quality and integrity of the genomic DNA were checked optically on a NanoDrop™ (Thermo Scientific, Waltham, MA, USA) using the A260 nm/A280 nm ratio (1.83-1.99).

Gel Electrophoresis and PCR Product Visualization
The PCR-amplified products were separated by electrophoresis on a 2% agarose gel, stained with SYBR Safe dye, and visualized under a UV transilluminator from Bio-Rad, Hercules, CA, USA.
AKT-1 rs1130233 G > A gene variation: The AKT-1 outer region was amplified by the outer primers FO and RO, yielding a band of 466 bp that served as a DNA purity check. A band of 213 bp was produced by the primers FI and RO amplifying the G allele, and a band of 298 bp was produced by the primers FI and RO amplifying the A allele (Supplementary Figure S1).
Phosphatidylinositol 3-kinase (PI3K) rs121913281 C > T gene variation: The sequences of the two forward primers and the common reverse primer are shown in Table 1. Primer F2T and the common reverse primer amplify the mutant-type allele (TT genotype; 364 bp). Primer F1C and the common reverse primer generate a band (364 bp) for the wild-type allele (CC genotype), as depicted in Supplementary Figure S2.
KLF-14 rs972283 G > A gene variation: The KLF14 outer region was amplified by the outer primers F1 and R1, producing a band of 437 bp that served as a DNA purity check. A band of 221 bp was produced by primers F1 and R2 amplifying the A allele, and a band of 274 bp was produced by primers F2 and R1 amplifying the G allele. Results were confirmed by Sanger sequencing (Figure 2A-C).
MDM4 rs11801299 A > G gene variation: The outer primers Fo and Ro amplified the MDM4 outer region, producing a band of 468 bp that served as a DNA purity check. A band of 223 bp was produced for the A allele by primers FI and Ro, and a band of 301 bp was produced for the G allele by primers Fo and RI (as depicted in Supplementary Figure S3). The results were confirmed by Sanger sequencing, as depicted in Figure 3A,B.
MiR-27a rs895819 A > G gene variation: The miR-27a outer region was amplified by the outer primers FO and RO, yielding a band of 353 bp that served as a DNA purity check. A band of 226 bp was produced by primers FI and R2 amplifying the A allele, and a band of 184 bp was produced by primers F2 and R1 amplifying the G allele (as depicted in Supplementary Figure S4).
MiR-196a2 rs11614913 C > T gene variation: The miR-196a2 exon was flanked by primers FO and RO, which resulted in a 297 bp band that served as a DNA purity check. A band of 153 base pairs (bp) was produced by primers FO and RI amplifying the C allele, and a band of 199 bp was produced by primers FI and RO amplifying the T allele (as depicted in Supplementary Figure S5).
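The band-size logic used to read an ARMS-PCR gel can be sketched in a few lines of Python. The example uses the miR-196a2 rs11614913 band sizes given above (297 bp control, 153 bp C allele, 199 bp T allele). The function names and the ±10 bp reading tolerance are illustrative assumptions and are not part of the study protocol.

```python
# Band sizes (bp) reported in the text for miR-196a2 rs11614913.
CONTROL_BP = 297   # outer primers FO/RO, DNA purity check
C_ALLELE_BP = 153  # C-allele-specific product
T_ALLELE_BP = 199  # T-allele-specific product


def has_band(observed, expected, tol=10):
    """Return True if any observed band lies within `tol` bp of `expected`.

    The tolerance is an assumed value for reading a gel, not a study parameter.
    """
    return any(abs(size - expected) <= tol for size in observed)


def call_genotype(observed):
    """Infer the genotype from the band sizes (bp) seen in one gel lane."""
    if not has_band(observed, CONTROL_BP):
        return "failed"  # no control band: the amplification cannot be trusted
    c_present = has_band(observed, C_ALLELE_BP)
    t_present = has_band(observed, T_ALLELE_BP)
    if c_present and t_present:
        return "CT"
    if c_present:
        return "CC"
    if t_present:
        return "TT"
    return "undetermined"
```

For instance, a lane showing bands at 297, 153, and 199 bp would be read as heterozygous (`call_genotype([297, 153, 199])` returns `"CT"`), while a lane without the 297 bp control band is flagged as failed regardless of the other bands.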



Sanger Sequencing for the Confirmation of Genotyping Results
To confirm the genotyping results for KLF-14 rs972283 G > A, MDM4 rs11801299 A > G, PI3K rs121913281 C > T, AKT-1 rs1130233 G > A, and miR-196a2 rs11614913 C > T obtained by ARMS-PCR, 20 randomly selected PCR products from the PCR systems for the polymorphic sites in these genes were sequenced by Sanger sequencing. Two primers, F Seq and R Seq, were used as sequencing primers for the above genes, as depicted in Table 1. PCR amplification was followed by purification using the QIAquick PCR Purification Kit from Qiagen (Germany). Finally, the purified PCR products were sequenced on an Applied Biosystems sequencer.

Whole Exome Sequencing
For whole-exome sequencing, DNA was extracted from peripheral blood using standard Qiagen nucleic acid isolation kits. The library was prepared according to the instruction manual of the Twist 2.0 Exome kit, and sequencing was performed on the Illumina NovaSeq 6000 platform according to the user manual. Quality control of the sequencing reads was carried out using FastQC v0.11.9. Raw reads were filtered to remove sequencing adapters and low-quality bases using TrimGalore v0.6.6. The resulting high-quality (HQ) reads were mapped to the hg38 human reference genome; variant calling (single-nucleotide variants (SNVs) and small InDels) was done with the Genome Analysis Toolkit (GATK) v4.2.4.1 best-practice pipeline using HaplotypeCaller.
Variant annotation was carried out using different databases and tools. The RefSeq database was used for the identification and characterization of gene-associated variants. Disease associations for variants were derived from databases such as OMIM and ClinVar. Population frequency information from 1000 Genomes, ExAC, gnomAD exome, gnomAD genome, and ESP was used to eliminate common variants/polymorphisms. PolyPhen-2 and SIFT scores were used to predict the effect of coding non-synonymous SNVs on protein structure and function. Furthermore, all variants were separately analyzed by multiple prediction tools for in silico variant effect prediction. All variants were then interpreted based on the ACMG guidelines (PMID:25741868), and variants classified as pathogenic, likely pathogenic, or of uncertain significance were reported. Some results are presented in Supplementary Tables S1-S3.
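The elimination of common variants by population frequency can be illustrated with a minimal Python sketch. The record layout, the field names, and the 1% frequency cutoff are all assumptions made for illustration; the text does not state the threshold or the data format actually used.

```python
# Hypothetical annotated variants; identifiers and field names are illustrative.
variants = [
    {"id": "var1", "pop_af": {"gnomad_exome": 0.0002, "1000g": None}},
    {"id": "var2", "pop_af": {"gnomad_exome": 0.12, "1000g": 0.10}},
    {"id": "var3", "pop_af": {"gnomad_exome": None, "1000g": None}},
]

MAX_AF = 0.01  # assumed cutoff for calling a variant "common"; not from the study


def is_rare(variant, max_af=MAX_AF):
    """Return True if no population database reports a frequency above max_af.

    A missing frequency (None) is treated as "not observed in that database".
    """
    freqs = [af for af in variant["pop_af"].values() if af is not None]
    return all(af <= max_af for af in freqs)


# Keep only variants that are rare (or absent) in every database consulted.
rare_ids = [v["id"] for v in variants if is_rare(v)]
```

With these toy records, `var2` is discarded as a common polymorphism, while `var1` and `var3` are retained for downstream pathogenicity prediction and ACMG-based interpretation.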

Statistical Analysis
All statistical analyses were performed using MedCalc software, version 20.027 (medcalc.org/calc/odds_ratio.php), SPSS 16.0 (SPSS, Inc., Chicago, IL, USA), SAS statistical software version 9.4 (SAS Institute, Inc., Cary, NC, USA), and Stata statistical software (StataCorp. 2013. Release 13. College Station, TX, USA). Deviation from Hardy-Weinberg equilibrium (HWE) was assessed by the Chi-square (χ²) goodness-of-fit test. A p-value < 0.05 was considered statistically significant. The distribution and association of PI3K rs121913281 C > T, AKT-1 rs1130233 G > A, KLF 14 (rs972283 C > T), MDM4 rs11801299 A > G, miR-27a rs895819 A > G, and miR-196a-2 rs11614913 C > T alleles and genotypes between the groups were determined by the Chi-square test. To assess the association between the risk of BC and the genotypes of KLF 14 (rs972283 C > T), MDM4 (rs11801299 A > G), miR-27a (rs895819 A > G), and miR-196a-2 (rs11614913 C > T), we generated odds ratios (ORs), risk ratios (RRs), and risk differences (RDs) with 95% confidence intervals (CIs). The OR was calculated by dividing the odds in the first group by the odds in the second group.

Table 2 provides an overview of the demographic characteristics of the 115 consecutively treated breast cancer patients. Complete clinical information was available for 100/115 breast cancer cases. BC patients were divided into two groups according to their age: those over 40 (n = 75, 65.2%) and those under 40 (n = 40, 34.8%). Of the cases with complete information, 30/100 (30%) were in an early stage (stages I and II) and 70/100 (70%) were in an advanced stage (stages III and IV) of breast cancer. The BC cases accounted for 70 (70%) cases; according to the histological classification, 10 (10%), 30 (30%), and 60 (60%) of BC cases were in grades I, II, and III, respectively.
According to receptor status, 60 patients (60%), 70 patients (70%), and 43 patients (43%) were positive for the estrogen receptor, the progesterone receptor, and the Her2/neu receptor, respectively. A total of 75 (75%) of the patients had distant metastases, compared to 25 (25%) of the patients who did not.

Biochemical Characteristics of Healthy Controls and Breast Cancer Patients
As reported in Table 2, the age at inclusion was comparable in the patient and control groups, with a mean age of ~27 years. Sex hormone levels for progesterone, follicle-stimulating hormone (FSH), luteinizing hormone (LH), and testosterone were significantly different, whereas no significant difference was found for estradiol levels between the patients and controls. The serum lipid profile for HDL, LDL, total cholesterol, and triglycerides also showed significant differences between the patients and controls.

Hardy-Weinberg Equilibrium
There was no deviation from Hardy-Weinberg equilibrium (HWE) in the control group for the genotype distributions and allele frequencies of four SNPs: KLF 14 rs972283 C > T (χ² = 0.070, p = 0.79), MDM4 rs11801299 A > G (χ² = 0.190, p = 0.66), miR-27a rs895819 A > G (χ² = 0.0358, p = 0.849), and miR-196a-2 rs11614913 C > T (χ² = 1.928, p = 0.164). In addition, we randomly selected 10% of the samples from the normal control group to review the genotyping results, demonstrating that the accuracy rate was greater than 99%.
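The goodness-of-fit test behind these HWE values can be sketched as follows. The genotype counts are an assumption: they are back-calculated from the MDM4 control percentages reported in this paper (78.26%, 20.86%, and 0.86% of 115 controls, i.e. 90, 24, and 1) and are not taken from the authors' raw data. The function name is illustrative.

```python
import math


def hwe_chi_square(n_hom_ref, n_het, n_hom_alt):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium (1 df).

    Assumes a biallelic SNP with both alleles present in the sample.
    Returns (chi2, p_value).
    """
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)  # reference allele frequency
    q = 1 - p                              # alternative allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# MDM4 rs11801299 controls, counts inferred from the reported percentages.
chi2, p_value = hwe_chi_square(90, 24, 1)
```

With these inferred counts the sketch gives χ² of about 0.19 and a p-value of about 0.66, in line with the MDM4 values reported above, so the controls show no evidence of departure from equilibrium.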

T Genotypes in Breast Cancer Patients and Gender Matched Controls
The AKT-1 rs1130233 G > A genotype frequencies in breast cancer patients were GG (50%), GA (40%), and AA (10%), compared with GG (78.43%), GA (19.60%), and AA (1.90%) in gender-matched controls (Table 3). The difference in AKT-1 rs1130233 G > A genotypes between breast cancer patients and healthy controls was statistically significant (p < 0.0001). Additionally, breast cancer patients had a higher frequency of the A allele than healthy controls (0.30 vs. 0.12) (Table 3).
The PI3K rs121913281 genotype frequencies in breast cancer cases were CC (26.54%), CT (51.32%), and TT (22.12%), compared with CC (16.66%), CT (79.41%), and TT (3.92%) in gender-matched controls (Table 3). The difference in PI3K C > T genotypes between breast cancer patients and healthy controls was statistically significant (p < 0.0001). Additionally, breast cancer patients had a higher frequency of the T allele than healthy controls (0.48 vs. 0.43) (Table 3).
Similarly, the KLF 14 rs972283 G > A genotype frequencies in cases were GG (29.95%), GA (39.13%), and AA (33.91%), compared with GG (46.86%), GA (45.21%), and AA (13.91%) in gender-matched controls. The difference in KLF 14 rs972283 G > A genotypes between breast cancer patients and healthy controls was statistically significant (p = 0.002). Additionally, breast cancer patients had a higher frequency of the A allele than healthy controls (0.53 vs. 0.37) (Table 3). The MDM4 rs11801299 G > A genotype frequencies were GG (62.60%), GA (34.78%), and AA (2.60%) in breast cancer patients and GG (78.26%), GA (20.86%), and AA (0.86%) in controls. The difference in the MDM4 rs11801299 G > A SNP between breast cancer patients and controls was statistically significant (p < 0.03). Additionally, breast cancer patients had a higher frequency of the A allele than healthy controls (0.20 vs. 0.11) (Table 3).
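The allele frequencies quoted above follow directly from the genotype counts, as the short sketch below shows for MDM4 rs11801299. The counts are an assumption, back-calculated from the reported percentages with 115 subjects per group (cases 72/40/3, controls 90/24/1); the function name is illustrative.

```python
def variant_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Frequency of the variant allele from genotype counts.

    Each heterozygote carries one copy of the variant allele and each
    variant homozygote carries two; every subject contributes two alleles.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    return (n_het + 2 * n_hom_alt) / total_alleles


# MDM4 rs11801299 G > A, counts inferred from the reported percentages.
freq_a_cases = variant_allele_frequency(72, 40, 3)     # GG, GA, AA in cases
freq_a_controls = variant_allele_frequency(90, 24, 1)  # GG, GA, AA in controls
```

Under these inferred counts the A allele frequency is 0.20 in cases and about 0.11 in controls, matching the values given in the text.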

Logistic Regression Analysis of PI3K rs121913281 C > T Genotypes to Predict the Risk of Breast Cancer
Our findings indicated a protective association between the PI3K rs121913281 CT genotype and breast cancer susceptibility in the codominant model, with OR = 0.40 (95% CI = 0.204-0.804), RR = 0.62, and p < 0.009 (Table 5), whereas the PI3K rs121913281 TT genotype was strongly linked to breast cancer susceptibility, with OR = 3.54 (95% CI = 1.0544-11.896), RR = 2.62, and p < 0.040. In the dominant inheritance model, the association between the PI3K CC and PI3K (CT + TT) genotypes and breast cancer susceptibility was not significant, with OR = 0.55 (95% CI = 0.28-1.078), RR = 0.71, and p = 0.081. In the recessive inheritance model, a strong association was found between the PI3K (CC + CT) and PI3K TT genotypes and breast cancer susceptibility, with OR = 6.96 (95% CI = 2.33-20.78), RR = 3.81, and p < 0.0005 (Table 6). In the allelic comparison, the PI3K T allele was not associated with breast cancer susceptibility, with OR = 1.18 (95% CI = 0.80-1.73), RR = 1.09, and p = 0.387.
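How the odds ratios under the different inheritance models arise from one genotype table can be sketched as below. The genotype counts are an assumption: they are back-calculated from the reported PI3K percentages (cases 30/58/25 of 113, controls 17/81/4 of 102) and are not the authors' raw data. The confidence interval uses the standard log-odds (Woolf) approximation, which may differ from the exact routine in the statistical packages used in the study.

```python
import math


def odds_ratio_ci(case_exp, case_unexp, ctrl_exp, ctrl_unexp, z=1.96):
    """Odds ratio with an approximate 95% CI from a 2x2 table (Woolf method)."""
    odds_ratio = (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)
    se_log_or = math.sqrt(
        1 / case_exp + 1 / case_unexp + 1 / ctrl_exp + 1 / ctrl_unexp
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# PI3K rs121913281 genotype counts inferred from the reported percentages.
cases = {"CC": 30, "CT": 58, "TT": 25}
controls = {"CC": 17, "CT": 81, "TT": 4}

# Codominant model: each variant genotype against the CC reference.
codominant_ct = odds_ratio_ci(cases["CT"], cases["CC"], controls["CT"], controls["CC"])
codominant_tt = odds_ratio_ci(cases["TT"], cases["CC"], controls["TT"], controls["CC"])

# Dominant model: (CT + TT) against CC.
dominant = odds_ratio_ci(
    cases["CT"] + cases["TT"], cases["CC"],
    controls["CT"] + controls["TT"], controls["CC"],
)

# Recessive model: TT against (CC + CT).
recessive = odds_ratio_ci(
    cases["TT"], cases["CC"] + cases["CT"],
    controls["TT"], controls["CC"] + controls["CT"],
)
```

With these inferred counts the sketch gives odds ratios of roughly 0.41 (CT vs. CC), 3.54 (TT vs. CC), 0.55 (dominant), and 6.96 (recessive), close to the values reported above. The example also shows why the recessive estimate has such a wide interval: only four controls carry the TT genotype.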

Logistic Regression Analysis of KLF 14 rs972283 G > A Genotypes to Predict the Risk of Breast Cancer
To estimate the relationship between the KLF 14 rs972283 G > A genotypes and the risk of breast cancer, a multivariate logistic regression analysis was used to calculate the odds ratio (OR) and the risk ratio (RR) with 95% confidence intervals (CI) for each group. The data are summarized in Table 6. Our findings demonstrated a strong association between the KLF 14-AA genotype and increased breast cancer susceptibility in the codominant model, with OR = 3.69 (CI = 1.7672-7.7282), RR = 2.07 (CI = 1.3204-3.2493), and p < 0.0005. The KLF14-GA genotype, in contrast, was not linked to breast cancer susceptibility, with OR = 1.31 (CI = 0.7171-2.4004), RR = 1.12 (CI = 0.8681-1.4554), and p = 0.37. In the dominant inheritance model, comparing the KLF14-GG and KLF14-(GA + AA) genotypes, there was a significant association with increased breast cancer susceptibility, with OR = 1.87 (CI = 1.07-3.26), RR = 1.87 (CI = 1.0465-1.7336), and p < 0.026. In the recessive inheritance model, comparing the KLF14-(GG + GA) and KLF14-AA genotypes, there was a strong association with increased breast cancer susceptibility, with OR = 3.17 (CI = 1.6507-6.1076), RR = 1.94 (CI = 1.2618-2.9971), and p < 0.0005 (Table 6). In the allelic comparison, the KLF14-A allele was strongly associated with breast cancer susceptibility, with OR = 1.99 (CI = 1.3759-2.9015), RR = 1.42 (CI = 1.1693-1.7295), and p < 0.0003. In the overdominant inheritance model, no association was observed between the KLF14-GA and KLF14-(AA + GG) genotypes, with OR = 1.28 (CI = 0.7599-2.1694), RR = 1.13 (CI = 0.8747-1.4643), and p = 0.350 (Table 6).

Logistic Regression Analysis of MDM4 rs11801299 G > A Genotypes to Predict the Risk of Breast Cancer
Our findings showed a strong association between the MDM4-GA genotype and  (Table 7).

Logistic Regression Analysis of miRNAs 27a rs895819 A > G Genotypes to Predict the Risk of Breast Cancer
Our findings showed a strong association between the miR27a-GG genotype and increased breast cancer patient susceptibility in the codominant model, with an OR = 2.  (Table 8).

Correlation of KLF-14 rs972283 A > G Genotypes with Clinicopathological Features of Cases
The Krüppel-like transcription factor KLF-14 rs972283 A > G SNP was not related to age status in BC (p = 0.83) (Table 10); however, a strong association was found with breast cancer staging (p = 0.006). Similarly, a strong association was found with the estrogen receptor status (p = 0.002), the progesterone receptor status (p = 0.027), and the Her2/neu receptor status (p = 0.02). However, the histological grade of BC was not correlated with the KLF-14 rs972283 A > G SNP (p = 0.90) (Table 10). Additionally, the KLF-14 rs972283 A > G SNP was significantly associated with the status of receiving either herceptin (p = 0.020) or tamoxifen (p = 0.002) treatment.

Correlation of MDM4 rs11801299 G > A with Clinical Features of the Cases
The MDM4 rs11801299 G > A SNP was not associated with age status in BC (p = 0.46) (Table 11). A strong association was found with the staging of BC patients (p < 0.0005). The MDM4 rs11801299 G > A SNP was also correlated with the histological grade of BC (p < 0.03) (Table 11) and was significantly associated with the Her2/neu receptor status (p < 0.008), but it was not associated with ER status (p = 0.38) or PR status (p = 0.49). In addition, the MDM4 rs11801299 G > A SNP was associated with herceptin treatment (p = 0.020) but not with the tamoxifen treatment status (p = 0.47).

Correlation of miRNAs 27a rs895819 A > G Genotypes with Clinical Features of the Cases
Age status in BC was significantly correlated with the miR-27a rs895819 A > G SNP (p < 0.0005) (Table 12). The stage of BC was also significantly correlated with the miR-27a rs895819 A > G SNP (p = 0.0006); however, the histological grade of BC was not linked to this SNP (p = 0.87) (Table 12). Neither the progesterone receptor status nor the Her2/neu receptor status was significantly correlated with the miR-27a rs895819 A > G SNP (p = 0.55 and p = 0.07, respectively). Furthermore, the miR-27a rs895819 A > G SNP was not associated with the tamoxifen treatment status (p = 0.090) but was associated with herceptin treatment (p = 0.005).

Correlation of miR-196a-2 rs11614913 C > T Genotypes with Clinical Features of Cases
The miR-196a C > T variation was not correlated with age status in BC (p < 0.11) (Table 13). The miR-196a C > T SNP was significantly correlated with the staging of BC (p = 0.007). However, the miR-196a C > T polymorphism was not correlated with the histological grade of BC patients (p < 0.78) (Table 13). Regarding receptor status, the miR-196a C > T SNP was significantly associated with the Her2/neu receptor status (p < 0.006) but not with the estrogen receptor status (p = 0.41) or the progesterone receptor status (p = 0.305). In addition, the miR-196a-2 rs11614913 C > T SNP was associated with herceptin treatment (p = 0.006) but not with the tamoxifen treatment status (p = 0.39). The miR-196a C > T SNP was strongly associated with the distant metastasis status in BC patients (p < 0.017) (Table 13).

Disease Progression with Respect to Mutation or Gene Polymorphisms
Overall, due to the increasing survival rates of breast cancer patients, longer follow-up studies are required to shed light on prognostic data. In our study, there were no statistically significant differences in disease-free survival according to the gene polymorphisms (p > 0.05). There was a statistically significant difference in overall survival (OS) at 6 years according to some selected gene variants, namely PI3K rs121913281 C > T, AKT1 rs1130233 G > A, miRNAs 27a rs895819 A > G, and miR-196a-2 rs11614913 C > T, with lower survival in patients with polymorphic genotypes (p < 0.05) and a hazard ratio (HR) above 2.30. This difference was independent of adjuvant treatment with endocrine therapy. The figures show the cumulative and overall survival of breast cancer patients according to gene polymorphism, analysed by the Kaplan-Meier method, log-rank test, and Cox regression. Kaplan-Meier curves are shown for the PI3K rs121913281 C > T polymorphism in Figure 4A, the AKT1 rs1130233 G > A polymorphism in Figure 4B, the miRNAs 27a rs895819 A > G polymorphism in Figure 4C, and the miR-196a-2 rs11614913 C > T polymorphism in Figure 4D (log-rank test, p < 0.05). The genotypes did not influence patient age, stage at diagnosis, or tumor grade.
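The product-limit calculation behind a Kaplan-Meier curve can be sketched as follows. This is a minimal illustration in Python, not the software used in this study; the function name `kaplan_meier` and the follow-up times in the example are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times  : follow-up time for each patient (e.g. years)
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each time with >= 1 death.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up data (years) for one genotype group.
example_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
```

In practice one such curve would be estimated per genotype group and the groups compared with a log-rank test, as described above.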

Discussion
Breast cancer (BC) is the most common cancer in women and one of the important causes of death in women worldwide, including in Saudi Arabia. In this study, we investigated whether SNPs in the PI3K-AKT signalling pathway genes, KLF 14 (rs972283 C > T), MDM4 rs11801299 A > G, miRNAs 27a rs895819 A > G and miR-196a-2 rs11614913 C > T affect BC pathogenesis in the Saudi population.

Comparative Analysis of Phosphoinositide 3-Kinase (PI3K)/AKT Pathway in Breast Cancer
Phosphoinositide 3-kinase (PI3K)/AKT/mTOR pathway gene variations are commonly found in breast tumours and are linked to cellular transformation, carcinogenesis, cancer progression, and medication resistance. AKT1 and MTOR mutations have been found in breast tumour tissue [49,50]. Therefore, studying genetic polymorphisms in the mTOR pathway may shed light on connections between obesity and breast cancer risk. The mTOR pathway may be crucial in the development of breast cancer. Few studies have examined how genetic variation in the mTOR pathway relates to breast cancer risk, breast cancer subtypes, and cellular factors that are common in breast cancers, and only a small number of single-nucleotide polymorphisms (SNPs) have been examined [51,52].
AKT, also called protein kinase B, is a serine-threonine kinase that functions as a mediator of the PI3K-AKT-mTOR signaling pathway. AKT has an important role in an array of cellular processes. Many SNPs in the AKT gene have been observed to be associated with various types of cancer, including breast cancer [53,54]. Our findings showed a strong association between the AKT1 rs1130233-GA genotype and breast cancer susceptibility in the codominant model, with an OR of 3.20 (95% CI = 1.6829 to 6.084), RR = 1.84 and p < 0.0004; the AKT1 rs1130233-AA genotype was also strongly linked to breast cancer susceptibility, with an OR of 3.20 (95% CI = 1.682 to 6.084), RR = 3.69 and p < 0.044 (Table 4).
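As an illustration of how effect sizes of this kind are typically derived, the following minimal Python sketch computes an odds ratio, a Wald 95% confidence interval, and a relative risk from a 2 x 2 genotype table. The function name `odds_ratio_summary` and the genotype counts in the usage line are hypothetical and are not the data of this study; the RR is taken here as the proportion of cases among carriers divided by the proportion of cases among non-carriers, which is an assumption about the convention used.

```python
import math


def odds_ratio_summary(case_carrier, case_ref, control_carrier, control_ref, z=1.96):
    """Odds ratio, Wald confidence interval and relative risk for a 2x2 table.

    case_carrier    : cases with the genotype of interest
    case_ref        : cases with the reference genotype
    control_carrier : controls with the genotype of interest
    control_ref     : controls with the reference genotype
    z               : normal quantile (1.96 gives a 95% interval)
    """
    a, b, c, d = case_carrier, case_ref, control_carrier, control_ref
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    # Proportion of cases among carriers versus among the reference group.
    relative_risk = (a / (a + c)) / (b / (b + d))
    return odds_ratio, ci_low, ci_high, relative_risk


# Hypothetical counts: 40 carrier cases, 20 reference cases,
# 20 carrier controls, 40 reference controls.
or_, low, high, rr = odds_ratio_summary(40, 20, 20, 40)
```

For a codominant comparison, the carrier group would be a single genotype (for example GA) and the reference group the wild-type homozygote; for a recessive model, the homozygous variant genotype is compared with the other two genotypes combined.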
Studies have reported that the PI3K-AKT-mTOR pathway may be involved in the development of brain or central nervous system (CNS) metastasis in breast cancer patients [55]. PI3KR1-rs706716 has been reported to be strongly associated with CNS metastasis in metastatic breast cancer patients and may be included in a predictive composite score to detect early brain metastasis irrespective of breast cancer subtype. Similarly, in our study PI3K rs121913281 C > T showed a strong correlation with breast cancer susceptibility: in the recessive inheritance model comparing PI3K (CC + CT) with PI3K-TT genotypes, the OR was 6.96 (95% CI = 2.33 to 20.78), RR = 3.81, and p < 0.0005, and for the TT genotype in the codominant model the OR was 3.54 (Table 5). The PI3K/AKT/mTOR and RAF/MEK/ERK pathways have been shown to be activated by mutations and chromosomal translocations in their key targets. Research studies have shown that the PI3K/AKT/mTOR signaling pathway is dysregulated in most malignancies, including breast cancer [56].

Comparative Analysis of KLF 14, MDM4 in Breast Cancer
KLF 14 messenger RNA is significantly decreased in many types of human malignancies, such as breast cancer and colorectal carcinoma [57,58]. The frequencies of the MDM4 rs11801299 G > A and KLF-14 rs972283 A > G genotypes were determined in the breast cancer cases and gender-matched healthy controls (Table 3). The results showed that the AA genotype of KLF-14 rs972283 A > G was associated with BC (Table 3). Moreover, the KLF-14 rs972283 A > G SNP genotype distribution differed significantly between cases at an early stage and cases at an advanced stage (Table 10).
One study revealed that KLF14 reduction serves as a mechanism leading to centrosome amplification and tumorigenesis, further indicating that KLF14 serves as a tumour suppressor with potential as a biomarker and therapeutic target for several malignancies [23,59]. KLF14 inhibits the progression of cervical cancer by targeting ITGB1 via the PI3K/AKT signalling pathway [60]. Another study reported that whole-body loss of KLF14 function in male mice does not result in metabolic abnormalities under chow and high-fat diet (HFD) conditions, and concluded that there is redundancy for the role of KLF14 in the mouse and a diverging function in human malignancies [61]. KLF14/miR-1283 signaling was reported to downregulate cell proliferation in HER2+ breast cancer. It was also suggested that the KLF14/miR-1283/TFAP2C axis inhibits HER2+ breast cancer progression, which might provide novel insight for mechanistic exploration of this disease [62]. Our results also indicated a significant difference in the KLF-14 rs972283 A > G SNP genotype between cases with positive and cases with negative estrogen receptor status (Table 10). In addition, the KLF-14 rs972283 A > G SNP was significantly associated with the tamoxifen and herceptin treatment status (Table 10).
These results are consistent with studies reporting that KLF-14 is downregulated in brain cancer and colorectal cancer [59,61]. In addition, KLF14 suppresses cell growth and the cell transformation mediated by the oncogene KRAS [50]. The downregulation of KLF14 in cancers suggests that it has an anti-tumour role [58]. Moreover, loss of KLF14 leads to centrosome amplification and tumour progression [57]. Recently, it was shown that KLF14 suppresses cervical cancer progression by reducing cell proliferation and enhancing apoptosis through integrin β1 (ITGB1) via the PI3K/AKT signalling pathway [59,60]. The roles of KLF14 and the KLF-14 rs972283 A > G SNP in breast cancer development remain to be investigated in future studies.
MDM4 negatively regulates p53 as part of a negative feedback loop, and the influence of MDM4 gene variations on cancer development has been studied by different groups [63]. Nevertheless, some studies reported conflicting results [63,64]. MDM4 suppresses the transcriptional activity of p53 and, together with MDM2, regulates the degradation of p53. Gene variations in MDM4 have been reported to be associated with cancer risk [65,66]. MDM4 rs11801299 G > A was associated with breast cancer predisposition in this study (Tables 3 and 7). The MDM4 rs11801299 GA genotype and A allele were associated with an increased risk of breast cancer (Table 7). The results showed that the MDM4 rs11801299 G > A genotype distribution differed significantly between cases at an early stage and grade I and cases at an advanced stage and grade II of breast cancer (Table 11). Furthermore, the MDM4 rs11801299 G > A SNP genotype distribution was associated with the herceptin treatment status (Table 11). These results are inconsistent with some previous studies reporting that the MDM4 rs11801299 SNP is not associated with cancer [19,67,68]. However, our result may be in partial agreement with a study reporting that rs11801299 was significantly associated with the risk of retinoblastoma in a Chinese population [69]. The suppression of MDM4 has been suggested as a therapeutic strategy for the reactivation of p53 in hepatoblastoma [69]. The effects of these SNPs on the structure and functions of MDM4 remain to be elucidated in further protein-protein interaction studies [69,70].

Comparative Analysis of miRNAs 27a and miR-196a
The GG genotype of the miRNAs 27a rs895819 A > G SNP was associated with the risk of breast cancer in this study (Table 3). The miR-27a rs895819 A > G SNP was strongly associated with the age status of the cases and with the staging of BC (Table 12). Moreover, the miR-27a rs895819 A > G SNP was associated with herceptin treatment (Table 12). Previous studies investigating the possible association of the rs895819 SNP with the risk of breast cancer have been conducted in different populations [71]. The results, however, did not lead to a clear conclusion on whether there is an association. Yang et al., 2010 [72] suggested a protective role of the G allele of rs895819 in breast cancer risk in the German population. However, this protective role was not found in the Italian population [68]. In the Chinese population, Zhang et al., 2012 [73] suggested that there was no association between this SNP and breast cancer risk. However, in another study, also in the Chinese population, the G allele of rs895819 was reported to confer a protective role in the younger population [74]. miR-27a enhances tumor cell proliferation by inhibiting the AKT or tyrosine signaling pathways [75]. More studies with increased sample sizes and in different ethnicities are needed to better understand the association of this SNP with breast cancer. Our results indicated that the miR-196a-CT genotype of rs11614913 was strongly associated with increased breast cancer susceptibility (Tables 3, 9 and 13). The results showed that the miR-196a rs11614913 C > T SNP was associated with the staging of breast cancer and the Her2/neu receptor status (Table 13). This result is consistent with previous studies that reported the association of this SNP with cancer [74]. The C allele of miR-196a rs11614913 has been reported to increase the expression of miR-196a-2 [73,74].
Moreover, this result is also consistent with a study reporting increased expression of miR-196a in breast cancer tissues in comparison to normal tissues, and that this increased miR-196a-2 expression is associated with a higher stage of breast cancer [76,77]. Jiang et al. [78] reported that estrogen causes upregulation of miR-196a in estrogen receptor-positive breast cancer cells. The upregulation of miR-196a by estrogen enhances the development of breast cancer by targeting SPRED1 [78]. Since the C allele of miR-196a rs11614913 enhances the expression of miR-196a [73], our results may be consistent with the previous study [78], despite the fact that there was no association with estrogen receptor status (Table 13). This may be due to the relatively limited sample size used in this study, which is one of its limitations. The rapidly evolving technology of whole exome sequencing has made it possible to analyze genomic characteristics of tumor samples at an unprecedented speed.

Whole Exome Sequencing in Breast Cancer
Detecting pathogenic intronic variants resulting in aberrant splicing remains a challenge in routine genetic testing. Several whole exome sequencing studies reported frequently mutated genes in breast cancer, such as PIK3CA (13-42%), TP53 (43-75%), ARID1A (10-21%), RB1 (11-16%), PTEN (11-15%) and BRCA2 (13-26%) [79]. WES is a promising technology for developing biomarkers to be used in the clinic to better select patients for specific therapies [46,48]. In our study, the most common BRCA1 gene variants identified by whole exome sequencing in the breast cancer cases were BRCA1 rs799917 C > T/c.2612C > T/p.Pro871Leu, BRCA1 rs1799966 T > C/c.4900A > G/p.Ser1634Gly/p.S1634G, BRCA1 rs1799949 G > A/c.2082C > T/p.Ser694/p.S694, BRCA1 rs1060915 A > G/c.4308T > C/p.Ser1436/p.S1436 and BRCA1 rs16940 A > G/c.2311T > C/p.Leu771/p.L771 (Supplementary Tables S2 and S3). Whole-exome sequencing may improve risk assessment and provide insight into disease mechanisms for the development of more effective therapies. These results encourage conducting further studies using a large sample size, different ethnic groups, and large clinical trials for developing targeted therapies that could benefit breast cancer patients.

Conclusions
It was concluded that there is a strong association between the PI3K-AKT signaling pathway gene variants and breast cancer susceptibility and progression. Similarly, the KLF 14-AA, MDM4-GA, miR27a-GG and miR-196a-CT gene variants were associated with a higher risk of BC and were strongly correlated with the staging of BC patients. This study also reported low, novel, and intermediate-genetic-risk variants of PI3K, AKT-1, MDM4 and KLF-14 by utilizing whole-exome sequencing. These variants should be further investigated in larger cohort studies. Whole-exome sequencing is critical for improved risk assessment and for providing insight into disease mechanisms for the development of more effective therapies. These results encourage conducting further studies using a large sample size and in different populations.

Data Availability Statement:
We have included the data associated with the study in the manuscript. In case of specific queries, the corresponding authors can be contacted.